Note: Episodes listed below are ordered based on how likely they are to match your search request.
"Their cofounder and chief scientist Guillaume Lempo writes, today we are releasing Mistral large. Our latest model, Mistral large, is vastly superior to Mistral medium, handles 32k tokens of context, and is natively fluent in English, French, Spanish, German and Italian. We have also updated Mistral small on our API to a model that is significantly better and faster than Mixtral eight x seven B. Now, from a performance standpoint, the numbers are looking really promising, with Mistral large performing only below GPT four and ahead of Claude two and Gemini Pro. What's more, the fact that it is natively multilingual is something that people are taking note of."
"Open source AI models have completely changed the landscape of technology over the past year. One tiny team of ex DeepMind and meta researchers in France has made a huge splash recently. Mistral. This week, Alad and I are joined by Arthur Munch, the CEO and co founder of Mistral, who recently released Mistral Seven B, an Apache two licensed open source model that has changed people's mental models about what can be done with small models. Arthur, welcome to knowpriers."
"And despite being only a year old, Mistral is a serious contender. Now, in addition to that unconfirmed funding news, we also got more model updates from Mistral who uploaded their eight x 22 B model, as well as a brand new instruct eight x 22 b with function calling on hugging face. The model is fluent in English, French, Italian, German, and Spanish. It's a 141 billion parameter model. It has a 64,000 token context window, and it's released under the same Apache 2.0 license that they've released previous models on, which includes, of course, an ok for commercial use."
"Now, something I've said on this show before is that up until Mistral started releasing its models, Meta was the undisputed open source champ in terms of dominating where developer attention was. With the release of Mistral models, they definitely started to suck some of the oxygen out of Meta's room. And so this is a chance for Meta both to reclaim that open source space, but also to try to push into the general LLM performance space as well. Next up, something at the intersection of AI and geopolitics. One of the big questions right now in us policy is whether TikTok is going to be outright banned or, as many are advocating, forced into a sale to a us owned company for the US facing version."
"What sets Mistral AI apart is its open source approach to technology, which distinguishes it from other LLM developers like OpenAI and Anthropic. This strategy has paid off, with the company's valuation skyrocketing from $2 billion in December to an impressive $6 billion now. Investors are taking notice, with significant contributions expected from big names like General catalyst, Lightspeed Venture Partners, and DST Global. Mistral AI's success can be attributed to its innovative products such as the Mictrol eight X 22 B model, which features is a mixture of experts architecture that enhances efficiency and reduces hardware usage. The company's strategic partnerships, like its collaboration with Microsoft to integrate its models into the Azure cloud platform, have further solidified its position in the market."
"They have a few different ones here that you can use, like large. Next, small. And this one is excellent. So what's interesting about the TMAT Mistral is, I would say from an open source perspective, they have a model called Mistral Mixtral. And basically it's a mixture of experts."
"And then last but not least, a new open source model has leaked and it's pretty impressive. So there is a recent discovery of a seemingly new open source large language model on hugging face, referred to as MYQ 170 B. All right. And its potential really was to rival or even exceed the current top performing model, which we've all heard of, this little model called OpenAI's GPT four. So the CEO of Mistral confirmed the leak on Twitter X, whatever we're calling it, saying it was actually based off of an older model of Mistral."
"Moe and Gigazine writes, although details are unknown, eight x 22 Bmoe may have more than three times the number of parameters of the model. Mistral eight x seven B, which has been shown to outperform GPT 3.5 and llama 270 b in many benchmarks. They also add the total number of parameters may be up to 176 billion. The context length that can be handled is said to be 65k. Now the open source community is really excited about this one, not only because it appears that it might have increased capacity, but also because the last Mistral model that was announced was their first closed source model, which was to be distributed exclusively through Microsoft."
"Mistral has some very relevant models, commercial and open source, and it has a very strong developer platform that allows to do everything that you need to create your AI application. That would be a good achievement. Arthur, listen, I've so enjoyed doing this. Thank you for putting up with me going in many different fast moving directions. You've been incredibly patient and a brilliant guest."
"Which model? Using Mistral or Lama tour? Like yeah. What would be your best suggestions for someone to get started and build an app like this? So I am very biased in the sense that the Mistral team, they're formerly from meta but also from DeepMind, super talented folks."
"So let's fast forward to today. That's December 2023. We'll get to the role of open source in a bit, but let's just level set on what you've built so far. A couple of months ago, you released Mistral Seven B, which was a best in class dense model. And this week you're releasing a new mixture of experts model."
"To put that into perspective, that's a whopping $3 billion jump from its $2 billion valuation just a few months ago. So what's the secret behind Mistral AI's rapid growth? For starters, the company has developed a cutting edge mixture of experts architecture for its latest open source large language model, Mixtrol eight x 22 B. This innovative approach has allowed Mistrol AI to achieve competitive performance while reducing hardware usage as it activates only the necessary neural networks for a given prompt. It's like having a team of specialized experts working together seamlessly to tackle complex tasks."
"Paris based startup Mistral is less than a year old, but it's already making big waves in the AI field. Mistral makes AI models similar to the ones that power AI tools like Chat, GPT, and Gemini. And on Monday, Microsoft announced a partnership with the startup. As part of the deal, Microsoft will take a small stake in the company. WSJ tech reporter Sam Schechner joins us now to talk about what Mistral's entry into the ecosystem means for the future of AI."
"Yes. And then sliding window. Not yet. So it's in Mistral, but we maybe need to see it in more work. My conspiracy theory about Mistral is that."
"Communities and customers broad access so first others can train their models and deploy them on the AI data center infrastructure we're creating. This is illustrated by the announcement we made just 2 hours ago. What I think really stands for a new day and a new era for Microsoft support for the development of technology in Europe. It's the multi year partnership we just announced with Mistral AI, the leading AI company in France. And under this partnership, Mistral AI will now be able to train and deploy its leading models in our data center."
"So now with a model that's based on Mistral and Siglip, most probably this is a lot better for anyone that wants to use the model commercially. The model will also be smaller, so a lot better for inference and hopefully we can beat the performance of Vdefix ATB. That would be really good. So you're only producing a nine B? Not exactly nine B because we're taking off parameters."
"That's going to result, I think, in a renaissance of new kinds of chips that are capable of handling massive workloads of inference on device. We are yet to see those unlocked. But the good news is open source models are phenomenal at unlocking efficiency. The open source language model ecosystem is just so ravenous. When Mistral's new model came out, their mixture of experts open source model a few months ago, I think it literally took less than 24 hours for somebody to quantize it and then add GGML support so that you could run locally within the week."
"And the open source models are a particularly disruptive play here, launched by meta, and then Mistral has really good numbers out of know. To the extent that Amazon and others can host those without fees, it would be very I personally, based on everything I've studied in conversations with our friend Sonny, I do think that the variables that have been used to drive the nonlinear growth, both in the parameter count and in the window where they study relevance, I think those are probably topped out and they're trying to differentiate based on merging multiple models, but the user experience on that is janky and slower. And so I do think there's a chance that we're up against on the llms, up against a bit of a ceiling, and we'll see though. And I think that's why the startups that you're seeing are finding that there's not much difference because they're all kind of heading towards the same ceiling. Well, I will say coming out of last summer, we already started to see a cooling in terms of valuations and uprounds for these AI startups."
"That seems obvious to me, but it is not necessarily shared intuition for everybody. My other question is, I think what's happening here with these models that I've tried is that they are probably falling into the trap that we talked about at the beginning, which is they're starting with an open source model. I'll bet it's like llama, or perhaps Mistral. Now, there was, if it's llama, I don't know as much about exactly what Mistral includes or doesn't in the default package. But llama two chat has a pretty robust refusal."
"We exclusively report that Mistral AI is nearing a deal to raise funds at a $6 billion valuation, nearly tripling its level from six months ago. Existing backers, general Catalyst and Lightspeed Venture partners are expected to be among the biggest investors in the new funding round, in which Mistral is set to raise about $600 million. That's according to people familiar with the situation. And in another exclusive, we report that T Mobile and Verizon are in discussions to carve up us cellular, one of the country's last major regional wireless carriers, in separate transactions that would give both buyers access to valuable airwaves. T Mobile is closing in on a deal that would take over some operations and wireless spectrum licenses."